This is a panic download of occupywallst.org as of 2012-08-22. I also grabbed all images that are linked on occupywallst.org. Images from i.imgur.com, imgur.com and 2439-occupywallst-com.voxcdn.com are captured in separate warc.gz and .tar.gz archives.
Identifier occupywallst.org-20120822-mirror
Addeddate 2012-08-24 02:31:35
Creator occupywallst.org
Mediatype web
Date 2012
Year 2012
Identifier-access http://archive.org/details/occupywallst.org-20120822-mirror
Identifier-ark ark:/13960/t07w7kw9m
Imagecount 89643
Repub_state 4
Publicdate 2012-08-24 02:45:55
Firstfiledate 20120822234915
Lastfiledate 20120823164524
Scandate 20120822234915
NOTE1: The download started very late on 2012-08-22 and continued into 2012-08-23.
NOTE2: 2439-occupywallst-com.voxcdn.com is the same as occupywallst.org.
August 30, 2012
There is a tool called warctozip that converts a warc to a zip: http://warctozip.archive.org/
If you want to look at its source code, you can go here: https://github.com/alard/warctozip-service
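If you would rather pull out just a few files locally, something like the following rough Python sketch might do. It is not how warctozip works, only a minimal reader that uses nothing but the standard library. The function names and the idea of matching on a URL suffix are my own assumptions. It also assumes the payload is stored as-is: it does not undo chunked transfer encoding or gzip content encoding, and it ignores folded header lines.

```python
import gzip


def iter_warc_records(stream):
    """Yield (headers, body) for each record in an uncompressed WARC byte stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if not line.strip():
            continue  # blank separator lines between records
        if not line.startswith(b"WARC/"):
            raise ValueError("not a WARC record header: %r" % line)
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # blank line ends the WARC header block
            name, _, value = line.decode("utf-8", "replace").partition(":")
            headers[name.strip().lower()] = value.strip()
        # Content-Length gives the size of the record body in bytes.
        body = stream.read(int(headers["content-length"]))
        yield headers, body


def extract_matching(warc_gz_path, url_suffix):
    """Return {url: payload} for response records whose URL ends with url_suffix."""
    found = {}
    with gzip.open(warc_gz_path, "rb") as stream:
        for headers, body in iter_warc_records(stream):
            if headers.get("warc-type") != "response":
                continue  # skip request, metadata, warcinfo records
            # Some writers wrap the target URI in angle brackets.
            url = headers.get("warc-target-uri", "").strip("<>")
            if url.endswith(url_suffix):
                # Drop the HTTP status line and headers, keep the payload.
                _, _, payload = body.partition(b"\r\n\r\n")
                found[url] = payload
    return found
```

For example, extract_matching("some-archive.warc.gz", ".png") would return a dict of PNG payloads keyed by URL (the file name here is only a placeholder), which you can then write to disk however you like.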
August 28, 2012
Question about .warc files
How can we extract the content of .warc files?
Is there software or a utility for this?
I looked on Google, but it's very complicated: you have to install servers, Java, and other complicated stuff, and that is only if you want to BROWSE the archived website.
But I only want to extract some of the files, not all the HTML pages or any other meta information ...